Linguistic Classification Using Instance-Based Learning

Abstract

Traditionally, linguists have organized the languages of the world into language families, such as Indo-European, Dravidian and Sino-Tibetan. Within the Indo-European family, they further divide languages into sub-families such as Germanic, Celtic and Indo-Iranian. They do this by looking at similar-sounding words across languages and at the commonality of rules of word formation and sentence construction. In this work, we make use of computational approaches that are more scalable. More importantly, we contest the tree-based structure that language family models follow, which we feel is rather constraining and comes in the way of the natural discovery of relationships between any two languages. For example, the affinity that Sanskrit has with Irish, Iranian or English is better illustrated using a network model. Similarly, Indian language inter-relationships go beyond the confines of the Indo-Aryan and Dravidian divide. To enable the discovery of relationships between languages, in this paper we use instance-based learning techniques to assign language labels to words. Our approach comprises building a corpus of words and then applying clustering techniques to construct a training set. Following this, words are vocalized and classified by making use of a custom linguistic distance metric. We considered seven Indian languages, namely Kannada, Marathi, Punjabi, Hindi, Tamil, Telugu and Sanskrit. We believe that our work has the potential to usher in a new era in linguistics in India.

Keywords: Linguistics; Aryan invasion theory; Out of India theory; Soundex score; Instance-based learning; KNN; DBSCAN; Clustering; Clustering coefficient
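The abstract names KNN, a Soundex score and a custom linguistic distance metric, but it does not define the metric. The following is only a minimal sketch of the general idea, under stated assumptions: words are romanized, the phonetic key is standard American Soundex, and the distance is the number of differing positions between two Soundex codes. The names `soundex_distance`, `knn_label` and `TOY_TRAINING`, and the toy word list itself, are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Standard American Soundex consonant groups (assumed stand-in for the
# paper's phonetic representation, which the abstract does not specify).
_CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        _CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex code of a romanized word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return "0000"
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":  # h and w do not separate equal codes
            prev = digit
    return (code + "000")[:4]


def soundex_distance(a, b):
    """Hypothetical metric: count of differing Soundex positions (0 to 4)."""
    return sum(x != y for x, y in zip(soundex(a), soundex(b)))


def knn_label(word, training, k=3):
    """Assign the majority label among the k nearest training words."""
    nearest = sorted(training,
                     key=lambda item: soundex_distance(word, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, hand-made training pairs for illustration only (not the paper's corpus).
TOY_TRAINING = [
    ("neeru", "Kannada"),
    ("niru", "Kannada"),
    ("paani", "Hindi"),
    ("pani", "Hindi"),
    ("jal", "Hindi"),
]

if __name__ == "__main__":
    print(knn_label("neer", TOY_TRAINING, k=3))
```

Because instance-based learners keep the training words themselves rather than fitting a model, swapping in a different distance function only requires replacing `soundex_distance`; the voting step stays the same.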

Similar resources

Multiple Instance Learning-Based Birdsong Classification Using Unsupervised Recording Segmentation

Traditional techniques for monitoring wildlife populations are temporally and spatially limited. Alternatively, in order to quickly and accurately extract information about the current state of the environment, tools for processing and recognition of acoustic signals can be used. In the past, a number of research studies on automatic classification of species through their vocalizations have be...

Weighted Instance-Based Learning Using Representative Intervals

Instance-based learning algorithms are widely used due to their capacity to approximate complex target functions; however, the performance of this kind of algorithm degrades significantly in the presence of irrelevant features. This paper introduces a new noise-tolerant instance-based learning algorithm, called WIB-K, that uses one or more weights, per feature per class, to classify integer-va...

Supervised Learning Using Instance-based Patterns

This paper introduces a new classification algorithm of the instance-based learning type. Training records are converted into patterns associated with a known class label, and stored permanently in a trie-like tree structure along with other helpful information. New records are classified by selecting from the trie the two best patterns as solution hypotheses. Best pattern selection is done u...

Learning Instance Specific Distance for Multi-Instance Classification

Multi-Instance Learning (MIL) deals with problems where each training example is a bag, and each bag contains a set of instances. Multi-instance representation is useful in many real world applications, because it is able to capture more structural information than traditional flat single-instance representation. However, it also brings new challenges. Specifically, the distance between data ob...

Multiresolution Instance-Based Learning

Instance-based learning methods explicitly remember all the data that they receive. They usually have no training phase, and only at prediction time do they perform computation. Then they take a query, search the database for similar datapoints and build an on-line local model (such as a local average or local regression) with which to predict an output value. In this paper we review the advantage...

Journal

Journal title: Lecture Notes on Data Engineering and Communications Technologies

Year: 2022

ISSN: 2367-4520, 2367-4512

DOI: https://doi.org/10.1007/978-981-16-9113-3_63